Conversation
@yiyixuxu (Collaborator) commented Nov 22, 2025

https://huggingface.co/collections/hunyuanvideo-community/hunyuanvideo-15

Testing script:

import torch

from diffusers import HunyuanVideo15Pipeline, HunyuanVideo15ImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

dtype = torch.bfloat16
device = "cuda:0"
t2v_names = ["480p_t2v", "720p_t2v", "480p_t2v_distilled"]
num_frames = 31  # keep the frame count small for testing; 121 is the default

# test t2v
prompt="A close-up shot captures a scene on a polished, light-colored granite kitchen counter, illuminated by soft natural light from an unseen window. Initially, the frame focuses on a tall, clear glass filled with golden, translucent apple juice standing next to a single, shiny red apple with a green leaf still attached to its stem. The camera moves horizontally to the right. As the shot progresses, a white ceramic plate smoothly enters the frame, revealing a fresh arrangement of about seven or eight more apples, a mix of vibrant reds and greens, piled neatly upon it. A shallow depth of field keeps the focus sharply on the fruit and glass, while the kitchen backsplash in the background remains softly blurred. The scene is in a realistic style."
seed = 1
for name in t2v_names:
    print(f"Testing {name}...")
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()

    pipe = HunyuanVideo15Pipeline.from_pretrained(f"hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-{name}", torch_dtype=dtype)
    pipe.enable_model_cpu_offload()
    pipe.vae.enable_tiling()

    generator = torch.Generator(device=device).manual_seed(seed)
    video = pipe(
        prompt=prompt,
        generator=generator,
        num_frames=num_frames,
        num_inference_steps=50,
    ).frames[0]
    export_to_video(video, f"yiyi_test_hy15_{name}_output.mp4", fps=24)
    max_allocated = torch.cuda.max_memory_allocated() / 1024**3  # GB
    print(f"Max Allocated Memory: {max_allocated:.2f} GB")
    
# test i2v
i2v_names = ["480p_i2v", "720p_i2v", "480p_i2v_distilled", "720p_i2v_distilled"]

image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG")
prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
seed = 1
for name in i2v_names:
    print(f"Testing {name}...")
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()

    pipe = HunyuanVideo15ImageToVideoPipeline.from_pretrained(f"hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-{name}", torch_dtype=dtype)
    pipe.enable_model_cpu_offload()
    pipe.vae.enable_tiling()

    generator = torch.Generator(device=device).manual_seed(seed)
    video = pipe(
        prompt=prompt,
        generator=generator,
        image=image,
        num_frames=num_frames,
        num_inference_steps=50,
    ).frames[0]
    export_to_video(video, f"yiyi_test_hy15_{name}_output.mp4", fps=24)
    max_allocated = torch.cuda.max_memory_allocated() / 1024**3  # GB
    print(f"Max Allocated Memory: {max_allocated:.2f} GB")
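The peak-memory printout at the end of both loops is just a bytes-to-GiB conversion on `torch.cuda.max_memory_allocated()`. As a standalone sketch (the helper name `fmt_gib` is hypothetical, not part of diffusers or torch):

```python
def fmt_gib(num_bytes: int) -> str:
    """Format a raw byte count the way the loops above print peak memory."""
    return f"{num_bytes / 1024**3:.2f} GB"

# Off-GPU, torch.cuda.max_memory_allocated() is unavailable, so feed a literal:
print(fmt_gib(19_327_352_832))  # 18 * 1024**3 bytes -> "18.00 GB"
```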

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tin2tin commented Nov 28, 2025

Thank you for working on this. Eagerly awaiting this one (Wan doesn't work for me).

@yiyixuxu (Collaborator, Author) commented:

@tin2tin I will merge this soon. Do you want to test it out? All the checkpoints are uploaded; you can find the scripts in the PR description.

@tin2tin commented Nov 29, 2025

I don't have time right now, but I'll definitely check it out later.

@tin2tin commented Nov 29, 2025

720p_t2v seems to load, but it's too heavy for me to run (using your example code); it ended with a crash.

[screenshot]

480p_t2v: [screenshot]

HunyuanVideo 1.5 runs just fine on my setup in ComfyUI, so they must have optimized it significantly.
